Dependency Parsing with Reference to Slovene, Spanish and Swedish
Authors
Abstract
We describe a parser used in the CoNLL 2006 Shared Task, “Multilingual Dependency Parsing.” The parser first identifies syntactic dependencies and then labels those dependencies using a maximum entropy classifier. We consider the impact of feature engineering and the choice of machine learning algorithm, with particular focus on Slovene, Spanish and Swedish.
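The second stage described in the abstract, labeling already-identified dependencies with a maximum entropy classifier, can be sketched roughly as follows. A maximum entropy classifier is equivalent to multinomial logistic regression, which is what this pure-Python sketch trains. Everything specific here is an assumption for illustration and is not taken from the paper: the `arc_features` template, the `MaxEntLabeler` class, the toy sentence, its tags, and the label names.

```python
import math
from collections import defaultdict


def arc_features(words, tags, head, dep):
    """Hypothetical feature template for one unlabeled arc head -> dep."""
    return [
        "bias",
        f"head_tag={tags[head]}",
        f"dep_tag={tags[dep]}",
        f"tag_pair={tags[head]}>{tags[dep]}",
        f"dep_word={words[dep].lower()}",
        "dir=" + ("right" if dep > head else "left"),
    ]


class MaxEntLabeler:
    """Multinomial logistic regression over sparse binary features."""

    def __init__(self, labels):
        self.labels = sorted(labels)
        self.w = defaultdict(float)  # (feature, label) -> weight

    def probs(self, feats):
        # Softmax over per-label scores, shifted by the max for stability.
        scores = [sum(self.w[(f, y)] for f in feats) for y in self.labels]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def train(self, data, epochs=200, lr=0.2):
        # Stochastic gradient ascent on the conditional log-likelihood.
        for _ in range(epochs):
            for feats, gold in data:
                for y, p in zip(self.labels, self.probs(feats)):
                    grad = (1.0 if y == gold else 0.0) - p
                    for f in feats:
                        self.w[(f, y)] += lr * grad

    def predict(self, feats):
        p = self.probs(feats)
        return self.labels[max(range(len(p)), key=p.__getitem__)]


# Toy input: the arcs are assumed to come from the first (unlabeled) stage.
words = ["The", "parser", "labels", "dependencies"]
tags = ["DET", "NOUN", "VERB", "NOUN"]
arcs = [(1, 0, "det"), (2, 1, "nsubj"), (2, 3, "obj")]  # (head, dep, label)

data = [(arc_features(words, tags, h, d), label) for h, d, label in arcs]
labeler = MaxEntLabeler({label for _, _, label in arcs})
labeler.train(data)

# Label the same arcs again; a real evaluation would use held-out sentences.
predicted = [labeler.predict(arc_features(words, tags, h, d)) for h, d, _ in arcs]
```

Separating attachment from labeling in this way lets the labeler use features of the complete unlabeled tree; the sketch above uses only local features of each arc.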
Related resources
Intraclausal Coordination and Clause Detection as a Preprocessing Step to Dependency Parsing
The impact of clause and intraclausal coordination detection on dependency parsing of Slovene is examined. New methods based on machine learning and heuristic rules are proposed for clause and intraclausal coordination detection. They were included in a new dependency parsing algorithm, PACID. For evaluation, the Slovene dependency treebank was used. At parsing, 6.4% and 9.2% relative error reduct...
Universal Dependency Annotation for Multilingual Parsing
We present a new collection of treebanks with homogeneous syntactic dependency annotation for six languages: German, English, Swedish, Spanish, French and Korean. To show the usefulness of such a resource, we present a case study of crosslingual transfer parsing with more reliable evaluation than has been possible before. This ‘universal’ treebank is made freely available in order to facilitate...
Parsing With Clause and Intra-clausal Coordination Detection
We present a new dependency parsing algorithm based on the decomposition of large sentences into smaller units such as clauses and intraclausal coordinations. For the identification of these units, new methods combining machine learning techniques and heuristic rules were developed. The algorithm was evaluated on the Slovene dependency treebank text corpus. Compared to the MSTP parser, currentl...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...
Slovene-Croatian Treebank Transfer Using Bilingual Lexicon Improves Croatian Dependency Parsing
A method is presented for transferring dependency treebanks between similar languages by using a bilingual lexicon, aiming to improve dependency parsing accuracy on the target language. It is illustrated by transferring the Slovene Dependency Treebank to Croatian by using a GIZA++ bilingual lexicon constructed from the Croatian-Slovene 1984 parallel corpus from the Multext East project. The tra...